🧠 LLM Inference
Quantization, Attention Mechanisms, Batch Processing, KV Caching
Understanding LLM Inference Engines: Inside Nano-vLLM (Part 2)
neutree.ai · 17h · Discuss: Hacker News
🤖 Machine Learning
AI attention span so good it shouldn’t be legal
stackoverflow.blog · 1d
🤖 Machine Learning
Sequential Attention: Making AI models leaner and faster without sacrificing accuracy
research.google · 2d · Discuss: Hacker News, r/LocalLLaMA
🤖 Machine Learning
ML-LIB: Machine Learning Library Proposed For The Linux Kernel
phoronix.com · 12h · Discuss: Hacker News
🤖 Machine Learning
Continual learning and the post monolith AI era
baseten.co · 8h · Discuss: Hacker News
🤖 Machine Learning
Databricks adds MemAlign to MLflow to cut cost and latency of LLM evaluation
infoworld.com · 1d
🧠 Local llm
A Neuro Symbolic Architecture For Induced Epistemic Agency and System 2 Reasoning in Quantized Large Language Models
papers.ssrn.com · 1d · Discuss: Hacker News
🤖 Machine Learning
Building Highly Efficient Inference System for Recommenders Using PyTorch
pytorch.org · 1d · Discuss: Hacker News
🤖 Machine Learning
The control layer for AI
blog.dottxt.ai · 7h · Discuss: Hacker News
👁️ Observability
ggml: backend-agnostic tensor parallelism by JohannesGaessler · Pull Request #19378
github.com · 1d · Discuss: r/LocalLLaMA
🐢 Turso
Writing an LLM from scratch, part 32d -- Interventions: adding attention bias
gilesthomas.com · 7h · Discuss: Hacker News
🤖 Machine Learning
Fast Autoscheduling for Sparse ML Frameworks
ajroot.pl · 2d · Discuss: Hacker News
🤖 Machine Learning
Hypernetworks: Neural Networks for Hierarchical Data
blog.sturdystatistics.com · 1d · Discuss: Hacker News
🤖 Machine Learning
EBM vs. LLMs: Our Kona EBM a 96% vs. 2% Sudoku Benchmark
logicalintelligence.com · 1d · Discuss: Hacker News
🧠 Local llm
Show HN: 32KB deductive engine that catches LLM hallucinations
news.ycombinator.com · 2d · Discuss: Hacker News
📊 Prometheus
[RFC PATCH v1 0/4] Machine Learning (ML) library in Linux kernel
lore.kernel.org · 11h · Discuss: Lobsters, Hacker News
🧠 Local llm
As Rocks May Think
evjang.com · 3d · Discuss: Hacker News, r/programming
🤖 Machine Learning
Future leakage in block-quantized attention
matx.com · 4d · Discuss: Hacker News
🤖 Machine Learning
LlamaLib: A cross-platform C++/C# library for local LLMs based on llama.cpp
github.com · 14h · Discuss: Hacker News
🧠 Local llm
The Trigger in the Haystack: Extracting and Reconstructing LLM Backdoor Triggers
arxiv.org · 3d · Discuss: Hacker News
🧠 Local llm